Structure and function of the soil microbiome underlying N2O emissions from global wetlands

Wetland soils are the greatest source of nitrous oxide (N2O), a critical greenhouse gas and ozone depleter released by microbes. Yet, microbial players and processes underlying the N2O emissions from wetland soils are poorly understood. Using in situ N2O measurements and by determining the structure and potential functional of microbial communities in 645 wetland soil samples globally, we examined the potential role of archaea, bacteria, and fungi in nitrogen (N) cycling and N2O emissions. We show that N2O emissions are higher in drained and warm wetland soils, and are correlated with functional diversity of microbes. We further provide evidence that despite their much lower abundance compared to bacteria, nitrifying archaeal abundance is a key factor explaining N2O emissions from wetland soils globally. Our data suggest that ongoing global warming and intensifying environmental change may boost archaeal nitrifiers, collectively transforming wetland soils to a greater source of N2O.

D espite covering only 8% of the terrestrial Earth surface, wetland soils (including gley and peat soils) store one of the largest organic carbon (C) stocks. Microbial degradation of C and nitrogen (N) stocks can lead to substantial releases of greenhouse gases (GHGs), including nitrous oxide (N 2 O). N 2 O is a potent GHG with a global warming potential 265 times that of CO 2. N 2 O is the most important ozone-depleting substance 1 . This is particularly alarming as microbial sources of N 2 O may shift with environmental changes. Wetland soils are increasingly subject to land-use changes such as afforestation and transformation to agricultural land, both preceded by drainage, with long-term consequences for N 2 O emissions 2 . To reduce N 2 O emissions from wetland soils, we need a thorough understanding of biogeochemical pathways and critical environmental parameters, which shape the microbial activities underpinning the N cycle and N 2 O dynamics.
Microbial processes such as classical denitrification, nitrifier denitrification, and dissimilatory nitrate reduction to ammonia (DNRA) all contribute to N 2 O production mainly in anoxic conditions 3 . By contrast, ammonia oxidation, which is the first step in nitrification, is an aerobic process performed by three groups of ammonia oxidizing microorganisms: canonical ammonia oxidizing bacteria (AOB), ammonia oxidizing archaea (AOA), and complete ammonia oxidizers (comammox Nitrospira). AOA not only directly produce N 2 O, but also provide substrate for denitrification 4 . Yet, little is known about the environmental conditions that favor each process and thereby N 2 O production and consumption. AOA may play a pivotal, underexplored role in fueling denitrification and facilitating terrestrial N 2 O emissions 5 in many soil environments.
Here we analyzed 645 wetland soils ( Fig. 1a; Supplementary Data 1) to determine how the structure and function of microbial communities contribute to N 2 O emissions. Our unique dataset integrated global-scale analysis of functional metagenomes (to estimate relative abundance of N-cycle genes independently of PCR biases), multi-group metabarcoding (bacterial 16S, archaeal 16S, fungal 18S-ITS rRNA genes), absolute quantification of N-cycle gene abundances, as well as in situ N 2 O flux and ex situ potential N 2 production analyses. We further leveraged available genomics data to understand genetic mechanisms underlying N 2 O production. We hypothesized that the high N 2 O production in global wetland soils is mainly related to the diversity and abundance of nitrifying microbes, and that archaeal nitrifiers, both in terms of absolute and relative abundance to denitrifiers, are the most robust and accurate explanatory factor of N 2 O emissions from wetland soils globally.

Results and discussion
Global patterns of N 2 O fluxes. Our analysis indicated that warmer soils and more intensive land use progressively may enhance N 2 O release from wetland soils. N 2 O emissions showed exponentially increasing relationships with temperature of the warmest month ( Supplementary Fig. 1). In addition, the N 2 O emissions were strongly explained by land-use type (r 2 adj = 0.364, p < 0.001), with greatest values in the bare soils and lowest in the forest soils ( Supplementary Fig. 2). Assessment of environmental determinants of N 2 O fluxes revealed that N 2 O emissions decline towards higher latitudes (Fig. 1, Supplementary Fig. 1). Contrary to the N 2 O emissions, potential N 2 production peaked in the temperate climate in negative correlation with land-use intensity ( Supplementary  Fig. 3). In agreement with our findings, a recent local warming experiment 6 and global models 2 predict an increase in N 2 O production in response to warming across various ecosystems.
Relationships of global N 2 O fluxes to microbial diversity and taxa. Our analyses of microbial communities of wetland soils revealed that, like the increasing N 2 O emissions towards the equator (Fig. 1b), archaeal diversity significantly increased towards low latitudes ( Supplementary Fig. 4a). By contrast, mid-latitude wetland soils harbored the highest bacterial diversity, whereas fungal diversity showed no significant relationships with latitude but peaked at mean annual temperature of 10-15°C (Supplementary Fig. 4). Across all associations among archaea, bacteria, and fungi of the wetland soils, climate and soil variables had the greatest impact on microbial diversity ( Supplementary Fig. 5). General linear models combined with machine learning techniques indicated that archaeal diversity was best explained by soil C/N ratio, which agrees with a previous study on mineral soils 7 . Soil pH was the primary determinant of bacterial diversity ( Supplementary  Fig. 5) and relative abundance of the most common bacterial phyla ( Fig. 2; Supplementary Fig. 6), whereas fungal diversity showed a weak relationship with environmental factors ( Supplementary  Fig. 5). These results corroborate those from mineral soils, where bacteria show stronger environmental associations than fungi and warm temperate regions harbor the highest bacterial diversity 8 . In addition, soil pH constitutes the main determinant of bacterial diversity in the mineral soil microbiome 8,9 . To determine the main microbial groups associated with N 2 O emission in global wetland soils, we related N 2 O fluxes with the relative abundance of various microbial lineages based on 16S and 18S rRNA gene metabarcoding. The microbial phyla Proteobacteria, Acidobacteriota, and Chloroflexi are the most abundant globally (Fig. 2). However, these groups were not significantly associated with N 2 O fluxes (p > 0.05), whereas the relative abundance of AOA from the phylum Thaumarchaeota emerged as the most strongly correlated group with N 2 O emission (Fig. 3). This is in agreement with a previous study on arctic peat soils, where the contribution of ammonia oxidizing archaea to N 2 O flux was confirmed by group-specific ammonia oxidation inhibitors as well as molecular approaches 10 . A previous study also reports a strong association between the thaumarchaeal 16S rRNA and amoA genes in environmental samples 11 . We also found that among all prokaryotic and eukaryotes genera uncovered in metagenomics data, the Soil Crenarchaeotic Group (SCG) showed the strongest positive correlation with N 2 O emissions (Supplementary Data 2). Furthermore, of the total 620 archaeal OTUs uncovered by a long-read sequencing technology (PacBio) occurring in >5 sites, 11 OTUs (including 5 in the order Nitrososphaerales, which are confirmed ammonia oxidizers; Supplementary Data 3) showed positive correlations (r > 0.35, q < 0.2) with N 2 O emission. Of these, N 2 O fluxes showed the strongest correlation with the relative abundance of OTUs most closely associated with 'Candidatus Nitrosotenuis chungbukensis MY2' (r = 0.488, p < 0.001) and 'Candidatus Nitrosocosmicus oleophilus MY3' (Spearman's rank-correlation r = 0.477, p < 0.001). Both taxa produce N 2 O in pure culture 12 . In agreement with our study, a previous study found that in arctic peatlands N 2 O emission was driven by only two OTUs of Thaumarchaeota, one of which was closely affiliated to 'Ca. N. oleophilus MY3' 10 . Ammonia oxidizing archaea play a key role in nitrification in unfertilized soils and soils with low ammonia concentrations 13 . In addition, in unfertilized soils, nitrite and nitrate may be predominantly made available for denitrifiers through nitrification, making nitrification a limiting factor for denitrification.
Metagenomic analysis of pathways underlying global N 2 O fluxes. To investigate functional pathways contributing to N 2 O emission, we examined clusters of orthologous gene groups (OGs) using metagenomes (see the "Methods" section). Among all potential key genes involved in N 2 O emission from archaea, the relative abundance of the archaeal amoA (ENOG411114F) showed the strongest correlation with N 2 O (r = 0.625, p < 0.001; Fig. 3), followed by an OG with unknown functions (Supplementary Data 4). To further evaluate the genetic basis facilitating N 2 O emission, we compared the nucleotide sequences of the archaeal OTUs correlating with N 2 O emission with those showing no such correlation. Using BlastN searches of 16S rRNA gene reads against complete archaeal genomes, we located the closest genomesequenced relatives and obtained the corresponding genomic functional profiles. Based on these, we found that the aerobic ammonia oxidation pathway was restricted to four archaeal genera belonging to Thaumarchaeota-Nitrososphaera, Nitrosocosmicus, Nitrosotenuis, and Nitrosarchaeum (Supplementary Data 5). A strong association between the archaeal amoA gene abundance and N 2 O emission occurred across both natural and disturbed sites. Soil nitrate (NO 3 − ) content was also strongly correlated with the relative abundance of archaeal amoA (r = 0.551, p < 0.001). Comparative genomics analysis further revealed that archaea were more enriched in aerobic ammonia-oxidizing pathways compared with bacteria (5.3% vs 0.3%; Supplementary Data 6-8). Overall, our results support the potential key role of Thaumarchaeota in N 2 O emissions from wetland soils globally.
While the pathways and enzymes involved in thaumarchaeal N 2 O production are not fully understood, it has been suggested that AOA can produce N 2 O through both nitrosating hybrid formation and enzymatic denitrification 12,14 . Jung and colleagues proposed that 'Ca. N. oleophilus MY3' has a denitrification capacity using the putative cytochrome P450 NO reductase, homologs of which are present in other representatives of the genus Nitrosocosmicus 12 . However, ammonia oxidizing archaea lacking these homologs are also able to produce N 2 O 12,14 . Further studies are needed to establish the mechanisms behind thaumarchaeal N 2 O production.   Supplementary Fig. 7). The relative increase in archaeal nitrifiers compared to denitrifiers in lower latitudes coincided with the greater N 2 O emissions in these regions (Fig. 1). The absolute archaeal amoA abundance was slightly higher than the bacterial amoA abundance (qPCR: F = 6.00, p = 0.015), substantiating the importance of archaea in nitrification across wetland soils ( Supplementary Fig. 8), as previously reported for grassland and agricultural soils 15 . Our results also corroborate a local-scale metatranscriptomics study in mineral soils 16 , suggesting that archaea predominate over bacteria for ammonia oxidation in soils.
Other major genes involved in the N cycle, including those known to be involved in N 2 O production, were surprisingly of limited importance in explaining N 2 O emission (Fig. 3, Supplementary Fig. 7). The correlation between comammox amoA and N 2 O emission was expectedly weak, which may be related to the apparent adaptation of comammox Nitrospira to low ammonia or because comammox Nitrospira produce relatively small quantities of N 2 O 17,18 . Furthermore, the absolute abundance of comammox amoA was lower than that of archaeal amoA ( Supplementary Fig. 7). In addition, the abundance of reads related to anaerobic ammonium oxidation (anammox) and the nitrite/nitrate-dependent anaerobic methane oxidation (n-damo) did not correlate with the N 2 O fluxes. The abundance of nosZ genes, which encode the nitrous oxide reductase enzyme that consumes N 2 O, was positively correlated with N 2 O emission ( Supplementary Fig. 7). The denitrification genes responsible for N 2 O production (nirK and nirS) showed weak or no correlation with N 2 O emission ( Supplementary Fig. 7). The abundance of nir genes was strongly correlated with that of nosZ genes (Fig. 4a) that reduce N 2 O into inert N 2 . This consumption may explain Fig. 2 Environmental predictors of major archaeal, bacterial and fungal phyla (class for Proteobacteria) across the global wetland soils. The relative abundance data are based on the relative abundance of SSU rRNA genes (normalized by total SSU rRNA abundances per sample) as revealed by shotgun metagenomics (n = 74 independent sites). Boxes represent 25th-75th percentile of the data distribution with whiskers at 1.5 × the interquartile range and the middle line representing median. The size of circles corresponds to the partial importance based on Random Forest models (variability% of mean decrease in accuracy estimated based on out-of-bag-CV); blue and red depict negative and positive Spearman correlations, respectively (n = 74 independent sites). Archaeal and fungal phyla names are indicated in blue and red colour, respectively. The abbreviations are organic matter (OrM), pH (soil pH), C/N (carbon to nitrogen ratio), Ca (calcium), K(potassium), P (phosphorous), Mg (magnesium), and Von Post grade of decomposition (VPG). low N 2 O emissions from the soils enriched with nir genes. Potential N 2 production, however, was not significantly correlated with nosZ abundance (Fig. 4a). In addition, the set of genes associated with denitrification may vary in different species, and not all denitrifiers possess all genes related to this process 19,20 . Soil pH and organic carbon concentration may also affect the amount of N 2 O produced from denitrification ( Supplementary  Fig. 5). Nevertheless, denitrifiers may be more metabolically versatile than nitrifiers and use a range of compounds for both energy and respiration, which is reflected in their weaker environmental associations (Supplementary Figs. 2,5,9). There have been many previous attempts to correlate denitrification genes with soil N 2 O fluxes; whilst some studies have found a good correlation 21 , others have not [22][23][24][25] .
Next, we related N 2 O emissions with the diversity of all major genes involved in the N cycle (based on their absolute abundances quantified by qPCR) and found higher N 2 O emissions with increasing diversity of N cycle functional genes (Fig. 4c). The b Relationship between site mean relative abundance of archaeal amoA and N 2 O emission (n = 74 independent sites). The relative abundance of archaeal amoA was determined based on the relative abundance of metagenomics reads assigned to ENOG411114F (extracted from Hellinger transformed abundance matrix of archaeal OGs). The inset numbers represent a Spearman rank correlation coefficient (r) and corrected p-value for multiple testing using Benjamini-Hochberg method (q). Error bars represent the standard errors (SE) of the site means. c Partial least-squares regression (PLS regression) plot showing the relationships among the relative abundances of prokaryotic taxonomic groups (as determined by 16S metabarcoding) and N 2 O emission (n = 74 independent sites). Blue lines represent archaeal phyla. The statistical test used was two-sided.  26 , where nitrifiers may contribute to the generation of nitrate required for denitrification 27 . Variability of environmental conditions governed by water level dynamics has been shown to determine the diversity of microbes that affect N 2 O fluxes in wetland soils 28,29 . However, we found that the effect of soil factors (C/N ratio and pH) and temperature on functional gene diversity in our dataset (collectively explaining 57% of the variation; Supplementary Figs. 5, 9, 10) exceeded that of soil water content. Among the studied functional genes, the abundance of archaeal amoA correlated the best to temperature and C/N ratio ( Supplementary  Fig. 5), similarly to the N-cycle gene diversity. Contrary to our expectation, taxonomic diversity of microbes showed no correlation with N 2 O emissions (p > 0.05). Previous studies have shown conflicting results on the relationship between microbial diversity and N 2 O emissions, as reviewed in ref. 30 . The decoupling of taxonomic and functional diversities may be due to functional redundancy in microbial taxa active in the N cycle 31 .
Environmental determinants of N 2 O related microbial communities. We explored the environmental conditions favoring microbial taxa and genes driving N 2 O emission. The archaeal amoA displayed a unimodal relationship with mean annual air temperature peaking around 20°C (r 2 adj = 0.255, p < 0.001; Supplementary Fig. 9). This supports earlier findings of greater AOA activities in warmer seasons 32 . The strong positive correlations of the AOA/AOB ratio, mean annual air temperature and soil temperature ( Supplementary Fig. 11) is in line with the relatively high temperature optimum of AOA 33 . This finding suggests that elevated (>15°C) soil temperature in combination with optimal soil moisture 31 may promote N 2 O emissions from soils due to an increased AOA abundance. Nevertheless, how this may be offset by their altered balance with AOB remains to be determined.
Implications for predicting global N 2 O fluxes. Our analyses indicate that both the structure and function of wetland soil microbiome and climatic conditions determine N 2 O fluxes globally. Considering the combined effect of optimal soil moisture and temperature, archaea are important contributors to N cycling in drained wetland soils 34 . Furthermore, we provide evidence that archaeal abundance is a key factor associated with ammonia oxidation pathway that underlies N 2 O emission in wetland soils globally. Our results complement previous findings on the major role of archaea in N 2 O emission in alpine soils 35 and oceans 36 . In particular, the global distribution of AOA and their adaptation to low oxygen and ammonia concentrations may be suggestive of the substantial role of this microbial group in N cycling of wetland soils.
Taken together, our results suggest that nitrifying microbes may contribute more strongly to N 2 O emission than previously thought, and that the diversity of microbes involved in the N cycle may be the integral predictor of N 2 O emissions. To determine the mechanisms underlying global N 2 O emissions, we need to understand the relative role of nitrification and denitrification across a broad variety of habitat types as well as the effects of climate, vegetation, and land use. We predict that future drainage and warming of wetland soils will have negative consequences for regulating ecosystem services of wetlands through accelerating archaeal nitrification that increases substrate availability for denitrification, which collectively promote N 2 O emission. Although we could not distinguish cause and effect, our study generates insights into nitrogen cycling and microbial drivers of N 2 O emission in wetlands.

Methods
Study sites and sampling. We sampled gas and soil in 29 regions throughout the A (rainy tropical), C (temperate), and D (boreal) climate types of the Köppen classification from six continents during the vegetation period between August 2011 and June 2018, following a standard protocol 26 . According to the protocol, the gas and soil samples were collected from locations in public domain or in previous agreement with the local community and/or property owner. The samples were exported from the origin countries and imported to Estonia, EU in cooperation with customs officers of the respective states, following the legal provisions of soil export and import, specifically exemptions for scientific purposes. To capture the full range of environmental conditions in each region, we established 76 wetland soil sites under different vegetation (mosses, sedges, grasses, herbs, trees, and bare soil) and land-use types (natural-raised bog, fen, and forest; agricultural -arable, hay field and pasture; and a peat extraction area) ( Fig. 1a; Supplementary Data 1). We used a four-grade land-use intensity index to quantify the effect of land conversion: 0, no agricultural land use (natural mire, swamp, or bog forest); 1, moderate grazing or mowing (once a year or less); 2, intensive grazing or mowing (more than once a year); and 3, arable land (directly fertilized or unfertilized). The vegetation and land-use intensity types and the land-use intensity index were estimated from observations and contacts with site managers and local researchers.
Within the sites, we established 1-4 stations 15-500 m apart to maximize the captured environmental variation. Each of the 196 stations were equipped with 3-5 opaque PVC 65 L truncated conical chambers 1.5-5 m apart and an observation well (perforated, 50 mm diameter PP-HT pipe wrapped in geotextile; 1 m in length). From each of the 645 chambers, N 2 O fluxes were measured following the static chamber method 37 using PVC collars (0.5 m diameter, installed to 0.1 m depth in soil). Stabilization of 3-12 h was allowed before gas sampling to reduce the disturbance effect of inserting the collars on fluxes. The chambers were placed into water-filled rings on top of the collars. Gases were sampled from the chamber headspace into a 50 mL glass vial every 20 min during a 1-h session. The vials had been evacuated in the laboratory 2-6 days before the sampling. At least three sampling sessions per location were run within 3 days. Water-table height was recorded from the observation wells during the gas sampling at least 8 h after placement. Soil temperature was measured at the 10 and 20 cm depth.
Soil samples of 150-200 g were collected from the chambers at 0-10 cm depth after the final gas sampling, and transported to laboratories in Tartu, Estonia. The homogenized samples were divided into subsamples for physical-chemical analyses and DNA extraction. The samples for chemical analyses were stored at 4°C and microbiological samples were stored at -20°C. DNA extraction was provided at the Tartu University environmental microbiology laboratory (see details below). Using a PP-HT plastic cylinder, intact soil cores (diameter 6.8 cm, height 6 cm) for the N 2 analysis with the He-O 2 method 38 were collected from the topsoil (0−10 cm) inside 252 chambers at 26 sites, starting from 2014. Samples from different climates were run at respective temperatures. During transport, the soil samples were kept below the ambient soil temperature from which they were collected.
Gas flux analyses. The gas samples were analyzed for N 2 O concentration within 2 weeks using two Shimadzu GC-2014 gas chromatographs equipped with ECD, TCD, and a Loftfield-type autosampler. The N 2 O fluxes were determined on linear regressions obtained from consecutive N 2 O concentrations taken when the chamber was closed, using p < 0.05 for the goodness of fit as a quality threshold for the linear regression. During the quality control, in cases of insignificant regression (p > 0.05 we removed one outlier. If the regression remained insignificant but the flux value fell below the gas-chromatography measuring accuracy (regression change of N 2 O concentration δv < 10 ppb), we included it in the subsequent analyses as a zero value.
The helium atmosphere soil incubation technique 30 was used to measure potential N 2 fluxes from soil cores. The cylinders with intact soil cores were placed into special gas-tight incubation vessels located in a climate chamber. Gases were removed by flushing with an artificial gas mixture (21.0% O 2 , 358 ppm CO 2 , 0.313 ppm N 2 O, 1.67 ppm CH 4 , 5.97 ppm N 2, and He). The new atmosphere equilibrium was established after 12-24 h by continuously flushing the vessel headspace with the artificial gas mixture at 20 mL/min. The flushing time depended on the soil moisture. Incubation temperature was kept similar with the field conditions. The gas-chromatograph (Shimadzu GC-2014) equipped with a thermal conductivity detector was used to measure N 2 concentration in the mixture of emitted gases accumulated in the headspace (start value, 40, 80, and 120 min as final value) of the cylinder after 2 h of closure. The gas concentration in the chambers increased in a near-linear fashion and linear regression was applied for calculation of the fluxes. The flux measurements with r 2 of 0.81 (p < 0.1) or greater were used.
Soil physico-chemical analysis. Plant-available phosphorus (P, NH 4 -lactate extractable) was determined on a FiaStar5000 flow-injection analyzer. Plantavailable potassium (K) was determined from the same solution by the flamephotometric method and plant-available magnesium (Mg) was determined from a 100 mL NH 4 -acetate solution with a titanium-yellow reagent on the flow-injection analyzer. Plant-available calcium (Ca) was analyzed using the same solution by a flame-photometrical method. Soil pH was determined using a 1 N KCl solution; soil NH 4 −N and NO 3 −N were determined on a 2 M KCl extract of soil by flowinjection analysis (APHA-AWWA-WEF, 2005). Total nitrogen and carbon contents of oven-dry samples were determined by a dry-combustion method on a varioMAX CNS elemental analyzer (Elementar Analysensysteme GmbH, Germany). Organic matter content of dry matter was determined by loss on ignition. We determined soil water content (SWC) from dry matter content and empirically established bulk densities of mineral and organic matter fractions.
DNA extraction, DNA library preparation, and sequencing. DNA extraction was performed from 0.2 g of frozen soil samples (homogenized) using the Qiagen DNeasy PowerSoil Kit (12888-100), following the manufacturer's recommendations. DNA concentrations were measured with Qubit™ 1X dsDNA HS Assay Kit using Qubit 3 fluorometer (Invitrogen). Altogether 645 individual soil samples were selected for metabarcoding of bacteria, archaea, and eukaryotes. For bacteria, we used the primers 515F (5′-GTGYCAGCMGCCGCGGTAA-3′) and 806RB (5′-GGACTACNVGGGTWTCTAAT-3′) to amplify the variable V4 region of the 16S rRNA gene 39 . Although these primers co-amplify archaea to some extent, we sought to specifically amplify a longer portion of their 16S rRNA gene to capture their full diversity, using the primers SSU1ArF (5′-TCCGGTTGATCCYGCBRG-3′) and SSU1000ArR (5′-GGCCATGCAMYWCCTCTC-3′) 40 . To amplify a broad range of eukaryotes, we used the primers ITS9mun (5′-GTACACACCGCCCG TCG-3′) and ITS4ngsUni (5′-CGCCTSCSCTTANTDATATGC-3′) that cover the V9 variable region of the 18S rRNA gene and the full internal transcribed spacer (ITS) region 41 . Both the forward and reverse primers were tagged with a 12-base multiplex identifier (MID), except in the case of archaea where only the forward primer was tagged with MID. All PCRs were performed in two replicates using 5 × HOT FIREPol ® Blend Master Mix (Solis BioDyne, Tartu, Estonia) in 25 μl volume. By default, the bacteria, archaea, and eukaryotes were amplified using 25, 35, and 30 cycles, respectively. In case of no amplification, two or five extra cycles were added, or DNA was re-extracted and re-purified. Thermal cycling included an initial denaturation at 95°C for 15 min; 25-40 cycles of denaturation for 30 s at 95°C, annealing for 30 s at 55°C, elongation for 1 min at 72°C; final elongation at 72°C for 10 min; and storage at 4°C. The two replicates of each reaction were pooled and visualized on TBE 1% agarose gel.
The bacterial amplicons were sequenced using the Illumina NovaSeq platform at 2 × 250 bp paired-end mode. Illumina amplicon libraries were generated using TruSeq DNA PCR-Free High Throughput Library Prep Kit with TruSeq DNA CD Indexes (Illumina). To increase identification accuracy and coverage, the archaeal and eukaryote amplicons were sequenced using a long-read sequencing technology on PacBio Sequel II platform 40,41 . SMRTbell library preparation followed the Pacific Biosciences Amplicon library preparation protocol. Metabarcoding analysis was repeated for samples yielding <5000 prokaryotic reads (Illumina), <500 archaeal reads (PacBio), or <1000 eukaryote reads (PacBio).
For the functional metagenome analysis, three replicate soil samples per station were pooled based on equimolar amount of DNA. Library preparation and indexing of each 196 pooled samples was performed using Nextera XT DNA Library Prep Kit in combination with Nextera XT Index kits v2 (Illumina). Metagenomes were sequenced based on the shotgun approach to an expected depth of 5,000,000 reads using Illumina NovaSeq with 2 × 150 bp paired-end mode. The samples with <1,000,000 reads were subjected to resequencing.
Quantitative PCR. We used qPCR to quantity the absolute abundance of bacterial and archaeal 16S rRNA genes as well as the key genes involved in N cycle pathways, including denitrification (nirS, nirK, nosZ clade I, and nosZ clade II), N fixation (nifH), dissimilatory nitrate reduction to ammonia (DNRA; nrfA), ammonia oxidation (bacterial amoA, archaeal amoA, comammox amoA), and anammox-and ndamo-specific 16S rRNA genes ( Supplementary Fig. 7). The qPCR assays were performed using RotorGene ® Q equipment (Qiagen, Valencia, CA, USA). The qPCR method was performed following ref. 34 . Briefly, the qPCR reactions were performed in 10 μL volume containing 5 μL Maxima SYBR Green Master Mix (Thermo Fisher Scientific Inc., Waltham, MA, USA), an optimized concentration of forward and reverse primers, 1 μL of template DNA and sterile distilled water. The gene-specific primer sets, optimized primer concentrations and thermal cycling conditions for each target gene are shown in Supplementary Data 8. The quantification data were analyzed with RotorGene Series Software (version 2.0.2; Qiagen, Hilden, Germany) and LinRegPCR program (version 2020.0) 42 . The gene abundances were calculated as a mean of fold differences between a sample and each 10fold standard dilution in respective standard as recommended by ref. 42 ; gene abundances were reported as gene copy numbers per gram of dry soil.

Bioinformatics
Metabarcoding. Illumina MiSeq sequences were analyzed using LotuS software 43 following ref. 44 . Briefly, the reads were demultiplexed and quality-filtered by trimming individual reads to 170 bp and removing reads with an accumulated error >2 or an estimated accumulated error >2.5 at a probability of ≥0.01. To pass to the next step, each unique read (reads preclustered at 100% identity) was required to be present at least eight times in at least one sample, four or more times in at least two samples, or three or more times in at least three samples. Chimeric OTUs were removed based on both reference-based and de novo chimera checking algorithms as implemented in uchime 45 . The resulting OTUs were taxonomically annotated by aligning their sequences with Lambda 46 to SILVA v135 database 47 and the LotuS least common ancestor (LCA) algorithm (options: -p miSeq derepMin 8:1,4:2,3:3 -simBasedTaxo 2 -refDB SLV -thr 8). For processing PacBio sequencing data, PipeCraft 48 was used as follows. Raw sequencing data was demultiplexed via mothur (version 1.36.1) 49 module in PipeCraft by allowing one mismatch to tag region (i.e. to index sequence that was used for multiplexing); quality filtering was performed using vsearch (version 1.11.1) 50 module with maximum expected error threshold of 1 (--fastq_maxee = 1) and discarding sequences with ambiguous bases (--fastq_maxns = 0); putative chimeric reads were discarded using vsearch uchime_denovo algorithm; prior clustering, full length ITS reads without conservative regions (18S and 28S rRNA genes; i.e. primer-binding sites) were extracted using ITSx software (version 1.0.11) 51 ; using UPARSE (version 8.1.1861), sequences were clustered to OTUs at 98% sequence similarity where singletons (clusters with only one sequence) were removed during the process (minsize = 2). Representative sequences of OTUs were taxonomically annotated based on the best blast hit against UNITE database (version 8) 52 followed by the LCA algorithm. For statistical analyses, we retained 645, 440, and 638 samples that yielded sufficient sequencing depth for Illumina 16S data (bacteria and archaea), PacBio 16S data (archaea) and PacBio ITS (fungi), respectively.
Metagenomics. Analysis of metagenomic reads was done using MATAFILER pipeline 53 . Briefly, reads obtained from the shotgun metagenomic sequencing of peat samples were quality-filtered by removing reads shorter than 70% of the maximum expected read length (150 bp), with an observed accumulated error >2 or an estimated accumulated error >2.5 with a probability of ≥0.01, or >1 ambiguous position. Using sdm software (version 1.46) 43 , reads were trimmed if base quality dropped below 20 in a window of 15 bases at the 3′ end, or if the accumulated error exceeded 2. Altogether 196 samples produced sufficient quantity of reads and were retained for statistical analyses. To estimate the functional composition of each sample, we implemented a similarity search approach using DIAMOND (version 2.0.5; options -k 5 -e 1e-4 -sensitive) in blastx mode 54 . Prior to that, the quality-filtered read pairs were merged using FLASH (version 1.2.10) 55 . The mapping scores of two unmerged query reads that mapped to the same target were combined to avoid double counting. In these cases, the hit scores were combined by averaging the percent identity of both hits. The best hit for a given query was based on the highest bit score and highest percent identity to the subject sequence. Using this method, we calculated the relative abundance of (clusters of) orthologous gene groups (OG) by mapping quality-filtered reads against the eggnog database (version 4) 56 . We also calculated metagenomic relative abundances (i.e. miTag 57 ) of different taxonomic groups based on small subunit (SSU) rRNA genes. For this, SortMeRNA (version 2.0) 58 was used to extract and blast search rRNA genes against the SILVA SSU database (v128). Reads approximately matching this database with e < 10 −4 were further filtered with custom Perl and C++ scripts, and merged using FLASH. In case read pairs could not be merged, the reads were interleaved such that the second read pair was reverse complemented and then sequentially added to the first read. Of these preselected reads, 50,000 reads were fine-matched the Silva SSU database using Lambda and the lowest common ancestor (LCA) algorithm adapted from LotuS.
Genomic analysis. The taxonomic analysis revealed that a few microbial lineages may disproportionally outperform the community functional diversity of microbes in affecting ecosystem biogeochemistry. Thus, we followed a trait-based approach to confirm our findings. We downloaded 385 complete archaeal genomes from NCBI as of 10/7/2020 (search terms: archaea[Organism] AND "complete genome"). These were used to build a reference database for a BlastN search to identify corresponding genomes and functional annotations of our archaeal OTUs. In addition, to better understand the potential functions of different archaeal lineages in N cycling, the functional annotation of all available archaeal genomes was retrieved from the Integrated Microbial Genomes and Microbiomes database (img.jgi.doe.gov) as of 15/7/2020. Data analysis. To account for differences in sequencing depth across samples, diversity indices (Shannon diversity index) were calculated based on rarefied abundance matrices (metabarcoding datasets) in vegan package 59 of R (version 2.5-6). Multivariate analyses were performed using Bray-Curtis dissimilarity on normalized taxa abundance matrices in vegan. All raw P-values of multiple tests were corrected using Benjamini-Hochberg method. Taxonomic abundance data were normalized using Hellinger transformation as implemented in vegan.
To test the effect of biotic variables on N 2 O emissions, we used Spearman correlation analysis components to identify the bacterial and archaeal taxonomic lineages and fungal OTUs most strongly associated with N 2 O emissions. Functional gene (OG) composition and taxonomic community matrices were normalized by library size using Hellinger transformation. We subsequently used partial least squares (PLS) analysis to predict N 2 O emissions based on taxonomic groups, which allows the dimensionality of multivariate data to be reduced into PLS components using plsdepot package 60 of R (version 0.1.17). Prior to this, we performed a backward variable elimination procedure to remove variables with low explanatory power (VIP threshold < 1), as implemented in plsVarSel package 61 of R (version 0.9.6).
For univariate analysis, the best predictors of the diversity and relative abundances of taxonomic and functional groups were identified using a machine learning approach implemented in randomForest package 62 of R (version 4. [6][7][8][9][10][11][12][13][14]. To further test direct and indirect effects of variables in the best model, structural equation modeling (SEM) was used as implemented in piecewiseSEM package 63 of R (version 2.1.0). The prior model was constructed based on our hypothesis (see the section "Introduction"). The optimal model fit was achieved by subsequent iterative revision based on modification indices. For linear relationships, Spearman's rank-correlation coefficient was calculated in R. For fitting non-linear relationships between soil water content and N 2 O emissions, generalized additive model (GAM) were constructed using smoothing parameter estimated by marginal likelihood (REML) maximization, as implemented in mgcv package 64 of R (version 1. . We also compared the goodness of fit estimates between first and second order polynomial models for certain analyses. The best polynomial fit was determined on the basis of Akaike Information Criterion (AIC) scores using "AIC" and "poly" functions of R.

Data availability
All metabarcoding sequences and associated metadata have been deposited in the European Bioinformatics Institute Sequence Read Archive database: https:// www.ncbi.nlm.nih.gov/bioproject/PRJNA718418; metagenomics sequences and associated metadata have been deposited at The European Nucleotide Archive under accession number https://www.ebi.ac.uk/ena/browser/view/PRJEB44414. Additional data generated in this study are provided in the Supplementary Information/Source Data file. SILVA database is available at https://www.arb-silva.de; UNITE database is available at https://unite.ut.ee/repository.php; Integrated Microbial Genomes is available at https:// img.jgi.doe.gov; eggnog database is available at http://eggnog5.embl.de/download/ eggnog_4.0/ Source data are provided with this paper.